Memory channel detector systems and methods

ABSTRACT

A method for determining decision metrics in a detector for a memory device. The method includes receiving a plurality of signal samples and extracting a set of statistics from the signal samples, wherein at least one of the statistics is non-linear or complex, is derived from a plurality of the signal samples, and is not a function of at least one real linear statistic that is derived from a plurality of the signal samples. The method also includes applying at least one decision metric function to the set of statistics to determine at least one decision metric value corresponding to at least one postulated symbol.

CROSS REFERENCE TO RELATED APPLICATIONS

The present application claims priority to U.S. Provisional Patent Application No. 61/753,853, filed Jan. 17, 2013; and United States Provisional Patent Application No. 61/804,154, filed Mar. 21, 2013.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

This invention was made with government support under Grant Nos. ECCS-1128705, CCF-1018984, and EECS-1029081 awarded by the National Science Foundation. The government has certain rights in the invention.

BACKGROUND

As a class of semiconductor data storage systems, flash memories are used in a variety of electronic devices, for example in music players and solid-state disk drives. Multilevel cell (MLC) flash memories have relatively low costs and high densities due to continuous improvements in scaling technology and the fact that MLC memories store more than one bit per cell. Scaling technology continues to increase cell density, which in turn enhances intercell interference (ICI), especially in MLC memories. Moreover, MLC technology narrows the width of the threshold voltage for each level and reduces the margins between adjacent levels in a cell, which results in degradation of reliability. Thus, reliable detection and encoding/decoding in flash memories is oftentimes difficult.

SUMMARY OF THE INVENTION

In a first aspect, embodiments of the invention provide a method for determining decision metrics in a detector for a memory device. The method includes receiving a plurality of signal samples and extracting a set of statistics from the signal samples, wherein at least one of the statistics is non-linear or complex, is derived from a plurality of the signal samples, and is not a function of at least one real linear statistic that is derived from a plurality of the signal samples. The method also includes applying at least one decision metric function to the set of statistics to determine at least one decision metric value corresponding to at least one postulated symbol.

In another aspect, embodiments of the invention are directed to a system. The system includes a memory channel and a detector in communication with the memory channel. The detector is configured to receive a plurality of signal samples; extract a set of statistics from the signal samples, wherein at least one of the statistics is non-linear or complex, is derived from a plurality of the signal samples, and is not a function of at least one real linear statistic that is derived from a plurality of the signal samples; and apply at least one decision metric function to the set of statistics to determine at least one decision metric value corresponding to at least one postulated symbol.

In another aspect, embodiments of the invention provide a detector for a memory device. The detector includes a first circuit configured to receive a plurality of signal samples. The detector also includes a second circuit in communication with the first circuit, the second circuit configured to extract a set of statistics from the signal samples, wherein at least one of the statistics is non-linear or complex, is derived from a plurality of the signal samples, and is not a function of at least one real linear statistic that is derived from a plurality of the signal samples. The detector further includes a third circuit in communication with the second circuit, the third circuit configured to apply at least one decision metric function to the set of statistics to determine at least one decision metric value corresponding to at least one postulated symbol.

In another aspect, embodiments of the invention provide an apparatus. The apparatus includes means for receiving a plurality of signal samples and means for extracting a set of statistics from the signal samples, wherein at least one of the statistics is non-linear or complex, is derived from a plurality of the signal samples, and is not a function of at least one real linear statistic that is derived from a plurality of the signal samples. The apparatus also includes means for applying at least one decision metric function to the set of statistics to determine at least one decision metric value corresponding to at least one postulated symbol.

In a further aspect, embodiments of the invention provide a non-transitory computer readable medium including software for receiving a plurality of signal samples; extracting a set of statistics from the signal samples, wherein at least one of the statistics is non-linear or complex, is derived from a plurality of the signal samples, and is not a function of at least one real linear statistic that is derived from a plurality of the signal samples; and applying at least one decision metric function to the set of statistics to determine at least one decision metric value corresponding to at least one postulated symbol.

In another aspect, embodiments of the invention provide a method for determining decision metrics in a detector for a memory device. The method includes receiving a plurality of signal samples and computing a set of statistics, wherein at least one of the statistics is obtained by FIR filtering or IIR filtering of at least one squared signal sample. The method also includes applying at least one decision metric function to the set of statistics to determine at least one decision metric value corresponding to at least one postulated symbol.

In a further aspect, embodiments of the invention provide a method for determining decision metrics in a detector for a memory device. The method includes receiving a plurality of signal samples and computing a set of statistics using a transformation of signal samples to obtain a characteristic-function-like set of statistics. The method also includes applying at least one decision metric function to the set of statistics to determine at least one decision metric value corresponding to at least one postulated symbol.

In another aspect, embodiments of the invention provide a method for determining decision metrics in a detector for a memory device. The method includes receiving a signal sample and at least one adjacent signal sample and computing a set of at least one statistic, wherein the at least one statistic is obtained by nonlinearly processing the at least one adjacent signal sample. The method also includes applying at least one decision metric function to the at least one statistic to determine at least one decision metric value corresponding to at least one postulated symbol.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a block diagram of an embodiment of a memory system.

FIG. 2 illustrates an embodiment of a flash memory structure.

FIG. 3 illustrates an embodiment of a flash memory channel model.

FIG. 4 illustrates an embodiment of a branch metric computation using a Fast Fourier Transform.

FIGS. 5-7 illustrate embodiments of lookup tables.

FIG. 8 illustrates an embodiment of a computation module.

FIG. 9 illustrates an embodiment of an even/odd bit line structure.

FIGS. 10-12 illustrate embodiments of computation modules.

FIG. 13 illustrates an embodiment of a process for determining decision metric values.

FIG. 14 illustrates a graph of a probability density function vs. a threshold voltage.

FIGS. 15 and 16 illustrate graphs of SIQ comparisons for different detectors.

DETAILED DESCRIPTION OF THE INVENTION

While the description herein generally refers to semiconductor memories, and various types of semiconductor memories such as MLC flash memories, it may be understood that the devices, systems and methods apply to other types of memory devices. The described embodiments of the invention should not be considered as limiting.

Embodiments of the invention may be used with or incorporated in a computer system that may be a standalone unit or include one or more remote terminals or devices in communication with a central computer via a network such as, for example, the Internet or an intranet. As such, the computer or “processor” and related components described herein may be a portion of a local computer system or a remote computer or an on-line system or combinations thereof. As used herein, the term “processor” may include, for example, a computer processor, a microprocessor, a microcontroller, a digital signal processor (DSP), circuitry residing on a memory device, or any other type of device that may perform the methods of embodiments of the invention.

Embodiments of the invention are directed generally to systems, methods and devices that may be used to detect written symbols by observing channel output values in a memory device, such as a semiconductor memory device, in the presence of intercell interference (ICI). Various embodiments allow for improvements in detector hard decision bit-error rates and detector soft decision quality.

Embodiments of the invention are described herein using channel models, including one-dimensional (1D) models with causal output memory and two-dimensional (2D) anti-causal models. Various embodiments are described herein as a mathematically tractable Viterbi-like maximum a posteriori (MAP) sequence detector for the 1D causal model with output memory. The statistics (sometimes referred to herein as “sufficient statistics”) of the channel model that may be used to implement the MAP detector may be obtained, for example, by using a fast Fourier transform (FFT). In various embodiments, a Gaussian approximation (GA) sequence detector is presented. In various embodiments, the MAP detector and the GA detector may be used for 2D anti-causal memory channels.

FIG. 1 illustrates a block diagram of an embodiment of a memory system 10. The memory system 10 is illustrated as an MLC flash memory system. It may be understood that the system 10 may be any type of memory system. The memory system 10 includes an encoder 12, a channel 14, a detector 16, and a decoder 18.

By way of example, a NAND flash memory is discussed hereinbelow. A NAND flash memory consists of cells, where each cell is a transistor with an extra polysilicon strip (i.e., the floating gate) between the control gate and the device channel. By applying a voltage to the floating gate, a charge is maintained/stored in a cell. In order to store data in the cell of an MLC flash memory, a certain voltage (i.e., one that falls into one of multiple required voltage ranges) is applied to the cell. All memory cells are hierarchically organized in arrays, blocks and page partitions, as illustrated in FIG. 2. The smallest unit that can be simultaneously accessed for programming (writing) or reading is a page, and the smallest unit that can be erased is a block.

Incremental step pulse program (ISPP), also called the program-and-verify technique with a staircase, or iterative programming, is an iterative technique that can verify the amount of voltage carried at each cell after each programming cycle. The ISPP approach provides a series of verification pulses right after each program pulse. Consequently, the threshold voltage deviation of a programmed cell tends to behave like a uniform random variable. As the programming of a cell is a one-way operation and because it is not possible to erase a specific cell separately from other cells in a block, a memory cell should be erased before programming. The distribution of the threshold voltage of an erased memory cell tends to be Gaussian.

An “even/odd bit-line structure” architecture may be used to program (write) data. Such an architecture separates all the cells into those at even bit-lines and those at odd bit-lines. During the process of programming, the cells at even bit-lines along a word-line are written at the same time instant, and then the cells at odd bit-lines along the word-line are written at the next time instant. An “all-bit-line structure” architecture may likewise be used to program the data. In such an architecture, all cells along a word-line are written simultaneously without distinguishing between even and odd cells. The even/odd bit-line structure has the advantage that circuitry may be shared and reused, while the all-bit-line structure has the advantage that the ICI tends to be lower.

As illustrated in FIG. 3, two sources of performance degradation that affect the threshold voltage in each memory cell are program/erase (PE) cycling and ICI. The PE cycling process distorts the final threshold voltage of a transistor in two different ways. The first distortion is due to the trapping and detrapping ability of the interface at the transistor gate, which leads to a fluctuation of the final threshold voltage of the cell. The fluctuation may be modeled by a Gaussian distribution with parameters dependent on the input voltages at a neighborhood of floating gates (i.e., signal-dependent noise) and the number of times that a cell has been programmed and erased. The second distortion arises when electrons are trapped in the interface area of a cell, which causes degradation of the threshold voltage. This effect may be exacerbated as the device undergoes many PE cycles.

ICI is a degradation that grows with density. As cells are packed closer to each other, the influence of threshold voltages from neighboring cells increases. In other words, due to the parasitic capacitance coupling effects among the neighboring cells, the change in the threshold voltage on one cell during the programming (charging) affects the final voltages of all the other cells (especially those cells that were already programmed). This disturbance may be modeled by a (truncated) Gaussian distribution whose parameters depend on the distance between cells.

Although a flash memory channel is not one-dimensional, but rather two-dimensional (2D) because the channel is a page-oriented channel, for clarity the channel model may be presented as a one-dimensional (1D) causal channel model. Also, a flash memory channel is not causal, but rather anti-causal, because only those cells that are programmed after the victim cell actually affect the victim cell. The 1D causal channel model is useful in formulating an optimal detector. As described hereinbelow, such a detector may be extrapolated to cover a 2D anti-causal channel.

Let the integer k stand for discrete time (in this case, position in the cell array). The channel input, denoted by X_(k), is the intended stored voltage amount in the k-th cell. The channel output, denoted by Y_(k), is the channel output voltage corresponding to the input value X_(k). According to MLC technology, it may be assumed that the channel input random variable X_(k) takes value from a finite alphabet χ={v₀, v₁, . . . , v_(m−1)} with |χ|=m<∞. It may be assumed that the channel input and the channel output have the relation:

$\begin{matrix}{Y_{k} = {X_{k} + {\sum\limits_{l = 1}^{L}{\Gamma_{l}^{(k)}\left( {Y_{k - l} - E_{k - l}} \right)}} + W_{k} + U_{k}}} & (1)\end{matrix}$

where: E_(k) is the erase-state noise at the k-th cell, modeled as a Gaussian random variable with mean μ_(e) and variance σ_(e) ², that is, E_(k) ~ 𝒩(μ_(e), σ_(e) ²); Γ_(l) ^((k)) is a fading-like coefficient that models causal ICI from the (k−l)-th cell towards the k-th cell (victim cell), and is also assumed to be a Gaussian random variable, Γ_(l) ^((k)) ~ 𝒩(γ_(l), g_(l)); L is the output memory, which implies that the current channel output Y_(k) is affected by its L neighbors Y_(k−1), Y_(k−2), . . . , Y_(k−L); U_(k) denotes the programming noise resulting from using the ISPP method of programming the k-th cell of a certain word-line, and is modeled as a zero mean uniform random variable with width Δ, that is, U_(k) ~ 𝒰(−Δ/2, Δ/2); and W_(k) is observation noise due to the PE cycling, and is distributed as a zero mean Gaussian random variable with variance σ_(w) ², that is, W_(k) ~ 𝒩(0, σ_(w) ²).

In one embodiment it may be assumed that all random variables Γ_(l) ^((k)), E_(k−l), W_(k) and U_(k) are mutually independent for all k and all l, and it may be assumed that the PE cycling/aging effect is incorporated into the model through the knowledge of σ_(w) ². That is, σ_(w) ² may depend on the device age. It may be understood that all noise sources and their parameters may be signal-dependent.
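By way of a non-limiting illustration, the channel relation (1) may be exercised with a short simulation. The following Python sketch is an assumption-laden example rather than part of the described embodiments: the function name simulate_channel, its parameter names, and the choice of treating outputs before the start of the sequence as zero are all hypothetical.

```python
import math
import random


def simulate_channel(x, gammas, g_vars, mu_e, sigma_e, sigma_w, delta, rng):
    """Draw one output sequence y from relation (1) for input voltages x.

    gammas[l-1] and g_vars[l-1] are the mean and variance of Gamma_l.
    Terms that would reach before the first cell are dropped
    (an assumed zero initial condition, not stated in the text).
    """
    L = len(gammas)
    # Erase-state noise E_k: one independent Gaussian draw per cell.
    e = [rng.gauss(mu_e, sigma_e) for _ in x]
    y = []
    for k, xk in enumerate(x):
        ici = 0.0
        for l in range(1, L + 1):
            if k - l < 0:
                continue  # assumed zero initial condition
            gamma = rng.gauss(gammas[l - 1], math.sqrt(g_vars[l - 1]))
            ici += gamma * (y[k - l] - e[k - l])
        w = rng.gauss(0.0, sigma_w)              # PE-cycling noise W_k
        u = rng.uniform(-delta / 2, delta / 2)   # ISPP programming noise U_k
        y.append(xk + ici + w + u)
    return y
```

With every noise parameter set to zero and a single deterministic coefficient, the output reduces to the recursion y_(k) = x_(k) + γ₁y_(k−1), which offers a simple sanity check of the sketch.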

Detectors constructed according to various embodiments may cover an extended channel model that covers intersymbol interference (ISI) in addition to intercell interference (ICI). Here, ISI denotes the dependence of the channel output on a neighborhood of intended written symbols (channel inputs) and ICI denotes the dependence of the channel output on a neighborhood of stored voltage values (channel outputs). Let X_(k) be the channel input at discrete time index k, which takes value from a finite alphabet χ with |χ|<∞. Let Y_(k) be the channel output corresponding to the input X_(k). The following causal channel model may be considered:

Y _(k)=Σ_(m=0) ^(M) A _(m) ^((k)) X _(k−m)+Σ_(l=1) ^(L) B _(l) ^((k))(Y_(k−l) +E _(k−l))+W _(k)   (1a)

where: A_(m) ^((k)) is a Gaussian random variable 𝒩(α_(m), s_(m)); B_(l) ^((k)) is a Gaussian random variable 𝒩(β_(l), g_(l)); E_(k−l) is a Gaussian random variable 𝒩(0, σ_(E) ²); and W_(k) is a Gaussian random variable 𝒩(0, σ_(W) ²).

Each one of the above random variables is independent of the same random variable at a different time index. Also, all random variables A_(m) ^((k)), B_(l) ^((k)), E_(k−j), W_(k) are mutually independent. In equation (1a), the term U_(k) is missing to illustrate that not all channels may suffer from quantization (programming) noise. If the channel does suffer from quantization noise, the term U_(k) may be included akin to equation (1).

It may be understood that the coefficients A and B in (1a) (or the parameters that describe their statistical behavior if the coefficients are random variables) and the parameters of the noise sources E_(k), W_(k) and U_(k) may be signal-dependent. In that case, the detector may exhibit signal-dependence. It may be understood that embodiments contemplate detectors that exhibit signal-dependent features by choosing the decision metric functions among a set of signal-dependent functions.

The sequence of random variables (X₁, X₂ . . . X_(n)) of length n may be denoted by X₁ ^(n). The realization sequence (x₁, x₂ . . . x_(n)) may be denoted by x₁ ^(n). The set of all possible realizations of the random sequence X₁ ^(n) may be denoted by χ^(n).

Detecting the input realization sequence x₁ ^(n) (for n>0) from the output realization y₁ ^(n) of the above channel model is done as follows. The maximum a posteriori (MAP) sequence detector of the state sequence x₁ ^(n) is the sequence {circumflex over (x)}₁ ^(n) that maximizes the joint conditional pdf:

$\begin{matrix}{\hat{x}_{1}^{n} = \arg\max\limits_{x_{1}^{n} \in \chi^{n}} f\left( x_{1}^{n}, y_{1}^{n} \mid x_{1 - M}^{0}, y_{1 - L}^{0} \right).} & (2)\end{matrix}$

As shorthand, f(x, y|i.c.) may be denoted as the conditional pdf of the right hand side of (2), where i.c. stands for initial condition (x_(1−M) ⁰, y_(1−L) ⁰). In various embodiments, the initial condition is assumed to be known.

It may be assumed that the input sequence is a Markov process of order M.

The pdf in (2) may be factored as:

$\begin{matrix}\begin{matrix}{{f\left( {\underset{\_}{x},{\underset{\_}{y}{i.c.}}} \right)} = {f\left( {x_{1}^{n},{y_{1}^{n}x_{1 - M}^{0}},y_{1 - L}^{0}} \right)}} \\{= {{P\left( {{x_{1}^{n}x_{1 - M}^{0}},y_{1 - L}^{0}} \right)}{f\left( {{y_{1}^{n}x_{1 - M}^{n}},y_{1 - L}^{0}} \right)}}} \\{= {\prod\limits_{j = 1}^{n}{\left( {{P\left( {x_{j}x_{j - M}^{j - 1}} \right)}{f\left( {{y_{j}x_{j - M}^{j}},y_{j - L}^{j - 1}} \right)}} \right).}}}\end{matrix} & (3)\end{matrix}$

Subsequently, the MAP detected sequence is equal to:

$\begin{matrix}{\hat{x}_{1}^{n} = \arg\min\limits_{x_{1}^{n} \in \chi^{n}} \sum\limits_{j = 1}^{n} \underbrace{\left\lbrack - \ln P\left( x_{j} \mid x_{j - M}^{j - 1} \right) - \ln f\left( y_{j} \mid x_{j - M}^{j}, y_{j - L}^{j - 1} \right) \right\rbrack}_{\text{Branch metric } \Lambda_{MAP}(x_{j - M}^{j},\, y_{j - L}^{j})}.} & (4)\end{matrix}$
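By way of a non-limiting illustration, the minimization over the sum of branch metrics may be sketched as a Viterbi-like search. The sketch below assumes input memory M=1, so that the trellis state is simply the previous symbol; the names viterbi, branch_metric and x0 are hypothetical. The branch_metric callback stands in for the branch metric and may close over the observed signal samples.

```python
def viterbi(n, alphabet, branch_metric, x0):
    """Return the sequence x_1..x_n minimising the sum of branch metrics.

    branch_metric(j, prev, cur) is the cost of moving from symbol `prev`
    at position j-1 to symbol `cur` at position j (1-based j).
    x0 is the known initial symbol (the initial condition).
    """
    cost = {x0: 0.0}   # best accumulated cost for each surviving state
    back = []          # back[j-1][cur] = best predecessor of cur at position j
    for j in range(1, n + 1):
        new_cost, ptr = {}, {}
        for cur in alphabet:
            best_prev = min(
                cost, key=lambda p: cost[p] + branch_metric(j, p, cur)
            )
            new_cost[cur] = cost[best_prev] + branch_metric(j, best_prev, cur)
            ptr[cur] = best_prev
        cost = new_cost
        back.append(ptr)
    # Trace back from the cheapest final state.
    cur = min(cost, key=cost.get)
    path = [cur]
    for ptr in reversed(back[1:]):
        cur = ptr[cur]
        path.append(cur)
    path.reverse()
    return path
```

In a symbol-by-symbol detector the same loop collapses to a single state, and the per-position minimum is taken directly over the alphabet.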

Evaluating the branch metric Λ_(MAP)(•,•) may require evaluating the conditional pdf f(y_(j)|x_(j−M) ^(j), y_(j−L) ^(j−1)) or some function thereof. The branch metric depends on L+1 real valued variables y_(j), . . . , y_(j−L). It may be desired to extract sufficient statistics (or a subset of sufficient statistics, referred to as “statistics”) from y_(j−L) ^(j) that will allow efficient computation of branch metrics.

The channel model may be rewritten as:

$\begin{matrix}{Y_{k} = \underbrace{\sum\limits_{m = 0}^{M} A_{m}^{(k)} X_{k - m} + W_{k}}_{R} + \sum\limits_{l = 1}^{L} \underbrace{B_{l}^{(k)}\left( Y_{k - l} + E_{k - l} \right)}_{Z_{l}}} & (5)\end{matrix}$

The conditional characteristic function of R and Z_(l) may be computed under the assumptions that X_(k−M) ^(k)=x_(k−M) ^(k) and Y_(k−L) ^(k−1)=y_(k−L) ^(k−1) are given. Note that if X_(k−M) ^(k)=x_(k−M) ^(k) is given, R is Gaussian 𝒩(μ_(R), σ_(R) ²) where:

$\begin{matrix}\begin{matrix}{\mu_{R} = \mathbb{E}\left\lbrack R \mid X_{k - M}^{k} = x_{k - M}^{k} \right\rbrack} \\ {= \sum\limits_{m = 0}^{M} \alpha_{m} x_{k - m}}\end{matrix} & (6) \\ \begin{matrix}{\sigma_{R}^{2} = \mathbb{E}\left\lbrack \left( R - \mu_{R} \right)^{2} \mid X_{k - M}^{k} = x_{k - M}^{k} \right\rbrack} \\ {= \sum\limits_{m = 0}^{M} s_{m} x_{k - m}^{2} + \sigma_{W}^{2}.}\end{matrix} & (7)\end{matrix}$

Hence, the conditional characteristic function of R is given as:

$\begin{matrix}\begin{matrix}{{G_{RX_{k - M}^{k}}(t)} = {\left\lbrack {{^{\; {Rt}}X_{k - M}^{k}} = x_{k - M}^{k}} \right\rbrack}} \\{= {{\exp \left( {{{- \frac{1}{2}}\sigma_{R}^{2}t^{2}} + {{\mu}_{R}t}} \right)}.}}\end{matrix} & (8)\end{matrix}$

Similarly, as illustrated hereinbelow for a product of two independent Gaussian variables, the conditional characteristic function of Z_(l) can be computed when Y_(k−l) ^(k−1)=y_(k−l) ^(k−1) and X_(k−M) ^(k)=x_(k−M) ^(k) are given, as:

$\begin{matrix}\begin{matrix}{{G_{{Z_{}|Y_{k - }^{k - 1}},X_{k - M}^{k}}(t)} = {\left\lbrack {{\left. ^{\; Z_{}t} \middle| Y_{k - }^{k - 1} \right. = y_{k - }^{k - 1}},{X_{k - M}^{k} = x_{k - M}}} \right\rbrack}} \\{= {\frac{1}{\sqrt{1 + {g_{}\sigma_{E}^{2}t^{2}}}}{\exp\left( \frac{\begin{matrix}{{- {t^{2}\left( {{y_{k - }^{2}g_{}} + {\beta_{}^{2}\sigma_{E}^{2}}} \right)}} +} \\{2{ity}_{k - }\beta_{}}\end{matrix}}{2\left( {1 + {g_{}\sigma_{E}^{2}t^{2}}} \right)} \right)}}}\end{matrix} & (9)\end{matrix}$

Combining (8) and (9), and utilizing the conditional independence (given y_(k−l) ^(k−1) and x_(k−M) ^(k)), yields the characteristic function:

$\begin{matrix}\begin{matrix}{{G_{{Y_{k}|Y_{k - }^{k - 1}},X_{k - M}^{k}}(t)} = {\left\lbrack {{\left. ^{\; Y_{k}t} \middle| Y_{k - }^{k - 1} \right. = y_{k - }^{k - 1}},{X_{k - M}^{k} = x_{k - M}}} \right\rbrack}} \\{= {\frac{1}{\sqrt{\prod\limits_{ = 1}^{L}\left( {1 + {g_{}\sigma_{E}^{2}t^{2}}} \right)}}{\exp\left( {{{- \frac{1}{2}}\sigma_{R}^{2}t^{2}} + {{\mu}_{R}t} +} \right.}}} \\\left. {\sum\limits_{ = 1}^{L}\left\lbrack \frac{\begin{matrix}{{- {t^{2}\left( {{y_{k - }^{2}g_{}} + {\beta_{}^{2}\sigma_{E}^{2}}} \right)}} +} \\{2{ity}_{k - }\beta_{}}\end{matrix}}{2\left( {1 + {g_{}\sigma_{E}^{2}t^{2}}} \right)} \right\rbrack} \right)\end{matrix} & (10)\end{matrix}$

In a system, sampling the characteristic function at various values of t yields a set of complex (i.e., non-real), nonlinear statistics dependent on a plurality of signal samples.
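By way of a non-limiting illustration, sampling the conditional characteristic function of Y_(k) at a handful of values of t may be sketched as follows. The function names char_fn and cf_statistics and all parameter names are hypothetical; the sketch assumes the product form of the characteristic function built from the Gaussian factor for R and one factor per Z_(l), with zero-mean erase noise of variance σ_(E) ².

```python
import cmath
import math


def char_fn(t, mu_r, var_r, y_prev, betas, g_vars, var_e):
    """Conditional characteristic function of Y_k evaluated at t.

    y_prev[l-1] is the neighbouring sample y_{k-l}; betas[l-1] and
    g_vars[l-1] are the mean and variance of the coefficient B_l.
    """
    # Gaussian factor contributed by R.
    log_g = -0.5 * var_r * t * t + 1j * mu_r * t
    scale = 1.0
    for y, beta, g in zip(y_prev, betas, g_vars):
        d = 1.0 + g * var_e * t * t
        # Exponent contributed by Z_l = B_l * (y_{k-l} + E_{k-l}).
        log_g += (-t * t * (y * y * g + beta * beta * var_e)
                  + 2j * t * y * beta) / (2.0 * d)
        scale *= d
    return cmath.exp(log_g) / math.sqrt(scale)


def cf_statistics(ts, mu_r, var_r, y_prev, betas, g_vars, var_e):
    """Sample the characteristic function on a grid of t values."""
    return [char_fn(t, mu_r, var_r, y_prev, betas, g_vars, var_e) for t in ts]
```

Each returned value is a complex number that depends on the squares y_(k−l) ² as well as on the samples themselves, which is what makes these statistics both non-real and nonlinear.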

Because the pdf is the Fourier transform of the characteristic function, the conditional pdf f(y_(j)|x_(j−M) ^(j), y_(j−L) ^(j−1)) can be obtained as:

$\begin{matrix}{f\left( y_{j} \mid x_{j - M}^{j}, y_{j - L}^{j - 1} \right) = \int_{- \infty}^{\infty} G_{Y_{j}|Y_{j - L}^{j - 1},X_{j - M}^{j}}(t)\, e^{- iy_{j}t}\, dt.} & (11)\end{matrix}$

In practice, the above integral may be implemented using a Fast Fourier Transform (FFT).
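By way of a non-limiting illustration, the inversion integral may be approximated numerically. The sketch below deliberately uses a plain trapezoidal sum rather than an FFT, which keeps it short; an FFT evaluates the same sum for a whole grid of y values at once. The sketch includes the conventional 1/(2π) normalization of the inverse Fourier transform, which does not affect the minimizing sequence. The names pdf_from_cf and gaussian_cf, the truncation limit t_max and the grid size n are hypothetical choices.

```python
import cmath
import math


def pdf_from_cf(cf, y, t_max=40.0, n=4001):
    """Approximate f(y) = (1 / 2pi) * integral of cf(t) * exp(-i*y*t) dt.

    The integral is truncated to [-t_max, t_max] and evaluated with the
    trapezoidal rule on n equally spaced points.
    """
    dt = 2.0 * t_max / (n - 1)
    total = 0.0 + 0.0j
    for i in range(n):
        t = -t_max + i * dt
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * cf(t) * cmath.exp(-1j * y * t)
    return (total * dt).real / (2.0 * math.pi)


def gaussian_cf(mu, var):
    """Characteristic function of a Gaussian, used here as a check case."""
    return lambda t: cmath.exp(-0.5 * var * t * t + 1j * mu * t)
```

A decision metric value may then be taken as the negative logarithm of the recovered pdf value for each postulated symbol.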

The branch metric Λ_(MAP)(x_(j−M) ^(j), y_(j−L) ^(j)) in (4) may be numerically computed for each branch in the Viterbi trellis using the fast Fourier transform (FFT). In various embodiments, for each branch in the trellis an FFT may be computed. The FFT itself is a complex (non-real) linear statistic. In symbol-by-symbol detectors, the trellis states degenerate to a single state, and a branch metric degenerates to a symbol-by-symbol decision metric. Hence, the term “decision metric” denotes either a branch metric in a Viterbi-like detector or a decision metric in a symbol-by-symbol detector.

It may be understood that the characteristic function and the pdf form a transform pair, where the elements of the pair are the FFT and the inverse FFT (iFFT) of each other. Embodiments disclosed herein are not limited to only characteristic functions, but apply to all other characteristic-function-like transforms (and their inverses). One example is the moment-generating function, which is the Laplace transform of the pdf. Other examples may include wavelet transforms, z-transforms, etc. It may be understood that the characteristic function embodiment contemplates all other characteristic-function-like transform embodiments.

For the special case in which σ_(W) ² does not depend on x_(j−M) ^(j) and s_(m)=0 for every m, one FFT may be computed for each trellis section (if the channel model contains ISI) or for each symbol (in a symbol-by-symbol fashion) if the channel model contains no ISI and does not require a trellis representation. Thus, the FFT is the same for all branches of the trellis section, but the actual branch metric values (decision metric values) may be obtained by sampling the FFT at different points.

The channel outputs y_(k−L) ^(k) may need to be processed in order to formulate the branch metrics. The processing complexity depends on the order L. FIG. 4 illustrates a branch metric computation using the FFT.

Example: L=1: If L=1, then (17) reveals that a set of sufficient statistics for the computation of branch metrics is: linear statistic y_(k); linear statistic β₁y_(k−1); and nonlinear statistic g₁y_(k−1) ².

In various embodiments, Λ_(MAP) may be obtained using a lookup table as illustrated in FIG. 5.

Example: L=2: If L=2, the exponent in (17) reveals that a set of sufficient statistics is: linear statistic y_(k); linear statistic β₁y_(k−1)+β₂y_(k−2); linear statistic β₁g₂y_(k−1)+β₂g₁y_(k−2); nonlinear statistic g₁y_(k−1) ²+g₂y_(k−2) ²; and nonlinear statistic g₁g₂y_(k−1) ²+g₁g₂y_(k−2) ².

Consequently, the branch metrics Λ_(MAP) may be computed using a lookup table as illustrated in FIG. 6.

Example: L>2: Extrapolating from the previous two examples, in various embodiments a set of sufficient statistics that solves this problem involves two types of finite impulse response (FIR) filters: one or more FIR filters acting linearly on the signal y_(k); and one or more FIR filters acting on the nonlinearly modified signal y_(k) ².
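By way of a non-limiting illustration, the two kinds of FIR filtering may be sketched as follows: one filter acts on the samples themselves and another acts on the squared samples. The function names fir and statistics, and the choice of β_(l) and g_(l) as the respective filter taps, are assumptions made for the example.

```python
def fir(taps, signal, k):
    """FIR filter output at index k: sum over l of taps[l-1] * signal[k-l].

    Terms that would reach before the start of the signal are skipped.
    """
    return sum(
        c * signal[k - l]
        for l, c in enumerate(taps, start=1)
        if k - l >= 0
    )


def statistics(y, k, betas, g_vars):
    """Return (y_k, linear FIR statistic, FIR statistic of squared samples)."""
    y_sq = [v * v for v in y]       # nonlinearly modified signal
    linear = fir(betas, y, k)       # FIR filter acting on y
    nonlinear = fir(g_vars, y_sq, k)  # FIR filter acting on y squared
    return y[k], linear, nonlinear
```

For instance, with y = [1, 2, 3], taps β = [0.5, 0.25] and g = [0.1, 0.2], the statistics at k = 2 are 3, 0.5·2 + 0.25·1 and 0.1·4 + 0.2·1.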

In various embodiments, a lookup table whose inputs are all the sufficient statistics may be too complicated to implement. In various embodiments, a lookup table as shown in FIG. 7 may be used.

The following outlines a suboptimal detector based on the Gaussian approximation according to various embodiments. According to (1), Y_(k) may be obtained as the summation of several random variables. Assume that f(y_(j)|x_(j−M) ^(j), y_(j−L) ^(j−1)) may be approximated by a Gaussian pdf as follows

f(y_(j)|x_(j−M) ^(j), y_(j−L) ^(j−1)) ~ 𝒩(μ_(G), σ_(G) ²),   (12)

where,

$\begin{matrix}\begin{matrix}{\mu_{G} = \mathbb{E}\left\lbrack Y_{j} \mid Y_{j - L}^{j - 1} = y_{j - L}^{j - 1}, X_{j - M}^{j} = x_{j - M}^{j} \right\rbrack} \\ {= \sum\limits_{m = 0}^{M} \alpha_{m} x_{j - m} + \sum\limits_{l = 1}^{L} \beta_{l} y_{j - l}}\end{matrix} & (13) \\ \begin{matrix}{\sigma_{G}^{2} = \mathrm{Var}\left\lbrack Y_{j} \mid Y_{j - L}^{j - 1} = y_{j - L}^{j - 1}, X_{j - M}^{j} = x_{j - M}^{j} \right\rbrack} \\ {= \sum\limits_{m = 0}^{M} s_{m} x_{j - m}^{2} + \sum\limits_{l = 1}^{L} \left( g_{l}\sigma_{E}^{2} + g_{l}y_{j - l}^{2} + \sigma_{E}^{2}\beta_{l}^{2} \right) + \sigma_{W}^{2}}\end{matrix} & (14)\end{matrix}$

Hence, using a similar procedure as used hereinabove, the Gaussian approximation branch metrics Λ_(MAP) ^((G))(x_(j−M) ^(j), y_(j−L) ^(j)) in (4) may be derived as:

$\begin{matrix}\begin{matrix}{\Lambda_{MAP}^{(G)}\left( x_{j - M}^{j}, y_{j - L}^{j} \right) = - \ln P\left( x_{j} \mid x_{j - M}^{j - 1} \right) - \ln f\left( y_{j} \mid x_{j - M}^{j}, y_{j - L}^{j - 1} \right)} \\ {= - \ln P\left( x_{j} \mid x_{j - M}^{j - 1} \right) + \frac{1}{2}\left( \ln\left( 2\pi\sigma_{G}^{2} \right) + \frac{\left( y_{j} - \mu_{G} \right)^{2}}{\sigma_{G}^{2}} \right)}\end{matrix} & (15)\end{matrix}$

The subset of sufficient statistics for computing Λ_(MAP) ^((G))(•,•) is:

$\omega_{j} = y_{j}, \qquad \theta_{j} = \sum\limits_{l = 1}^{L} \beta_{l} y_{j - l}, \qquad \varphi_{j} = \sum\limits_{l = 1}^{L} g_{l} y_{j - l}^{2}.$

The first two statistics are linear and the third is nonlinear.

Hence, the computation of Λ_(MAP) ^((G))(x_(j−M) ^(j), y_(j−L) ^(j)) is equivalent to computing Λ_(MAP) ^((G))(x_(j−M) ^(j), ω_(j), θ_(j), φ_(j)). Thus, the entire vector of L+1 signal samples y_(j−L) ^(j) (see FIG. 7) may be replaced by a new vector [ω_(j), θ_(j), φ_(j)] of only three statistics (even if L>2). Furthermore, in various embodiments, the actual computation of Λ_(MAP) ^((G))(x_(j−M) ^(j), ω_(j), θ_(j), φ_(j)) does not require lookup tables, but may be implemented using, for example, digital signal processing (DSP) components such as multipliers and adders. FIG. 8 illustrates an embodiment of a branch metric computation module of a GA detector using FIR filters.
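By way of a non-limiting illustration, the Gaussian approximation branch metric may be computed from the three statistics with a few multiplications and additions. The sketch below is an assumption-laden example: the function name ga_branch_metric and its parameter names are hypothetical, the input window x_win is taken to hold x_(j−m) for m = 0, . . . , M, and the prior probability of the postulated symbol is passed in directly.

```python
import math


def ga_branch_metric(omega, theta, phi, x_win, alphas, s_vars,
                     betas, g_vars, var_e, var_w, prior=1.0):
    """Gaussian-approximation branch metric from the statistics
    omega = y_j, theta = sum_l beta_l * y_{j-l}, phi = sum_l g_l * y_{j-l}^2.

    x_win[m] is the postulated symbol x_{j-m}; alphas[m] and s_vars[m]
    are the mean and variance of the coefficient A_m.
    """
    # Conditional mean: ISI part plus the linear statistic theta.
    mu = sum(a * x for a, x in zip(alphas, x_win)) + theta
    # Conditional variance: ISI part, nonlinear statistic phi,
    # sample-independent ICI part, and observation noise.
    var = (sum(s * x * x for s, x in zip(s_vars, x_win))
           + phi
           + var_e * sum(g + b * b for g, b in zip(g_vars, betas))
           + var_w)
    return (-math.log(prior)
            + 0.5 * (math.log(2.0 * math.pi * var) + (omega - mu) ** 2 / var))
```

In a symbol-by-symbol setting, the detected symbol may be taken as the postulated symbol with the smallest metric value.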

As apparent from FIG. 8, the signal samples may be either mean-adjusted or non-mean adjusted. In various embodiments, the term “signal sample” may encompass any of the following examples: raw signal sample, mean-adjusted signal sample, pre-equalized signal sample, digitized and/or quantized signal sample, etc.

If any of the channel coefficients' parameters α_(m), s_(m), β_(l), g_(l), σ_(E) ² and σ_(W) ² depend on the actual realization of the channel input X_(k−M) ^(k), then a class of signal-dependent (pattern-dependent) detectors is indicated.

In various embodiments, a noise model (V_(k)) for the channel model (1) may be represented as:

$\begin{matrix}{Y_{k} = \sum\limits_{m = 0}^{M} A_{m}^{(k)} X_{k - m} + \sum\limits_{l = 1}^{L} B_{l}^{(k)}\left( Y_{k - l} + E_{k - l} \right) + V_{k}} & (16)\end{matrix}$

The noise model not only has thermal Gaussian noise (W_(k)), but also contains the programming noise U_(k). The programming noise (akin to quantization error) may be modeled by a uniform distribution, which is assumed to be independent of all other noise sources. Therefore, the noise model (V_(k)) may be considered as:

V _(k) =W _(k) +U _(k)

where W_(k) is the same Gaussian random variable 𝒩(0, σ_(W) ²), and U_(k) is the uniform random variable 𝒰(0, Δ_(k)). Each of the random variables is independent of the same random variable at a different time index. Also, all random variables A_(m) ^((k)), E_(k−j), W_(k) and U_(k) are mutually independent.

By applying a similar FFT approach as discussed hereinabove, the characteristic function for the channel model may be calculated as:

$\begin{matrix}{\begin{aligned} G_{Y_{k}|Y_{k-L}^{k-1},X_{k-M}^{k}}(t) &= \mathbb{E}\left[ e^{iY_{k}t} \,\middle|\, Y_{k-L}^{k-1} = y_{k-L}^{k-1},\ X_{k-M}^{k} = x_{k-M}^{k} \right] \\ &= \frac{1}{\sqrt{\prod\limits_{\ell=1}^{L}\left( 1 + g_{\ell}\sigma_{E}^{2}t^{2} \right)}} \exp\left( -\frac{1}{2}\sigma_{R}^{2}t^{2} + i\left( \mu_{R} + \frac{\Delta_{k}}{2} \right)t + \sum\limits_{\ell=1}^{L}\left[ \frac{-t^{2}\left( y_{k-\ell}^{2}g_{\ell} + \beta_{\ell}^{2}\sigma_{E}^{2} \right) + 2it\,y_{k-\ell}\beta_{\ell}}{2\left( 1 + g_{\ell}\sigma_{E}^{2}t^{2} \right)} \right] \right) \mathrm{Sinc}\left( \frac{\Delta_{k}t}{2} \right) \end{aligned}} & (17)\end{matrix}$

where $\mathrm{Sinc}(\zeta) \overset{\Delta}{=} \frac{\sin(\zeta)}{\zeta}$.

Note that if the quantization noise (programming noise) has no temporal dependence and no signal-dependence, Δ=Δ_(k) may be used.

The whole conditional distribution may be approximated as a Gaussian distribution, and thus a suboptimal detector would be obtained as discussed hereinabove. The approximation may be modified by separating the major programming noise from other sources of noise in the model. Thus, the noise model V_(k) may be considered for the channel. The channel model may be rewritten as:

$\begin{matrix}\begin{matrix}{Y_{k} = {{\sum\limits_{m = 0}^{M}{A_{m}^{(k)}X_{k - m}}} + {\sum\limits_{ = 1}^{L}{B_{}^{(k)}\left( {Y_{k - } + E_{k - }} \right)}} + W_{k} + U_{k}}} \\{= {Z_{k} + {U_{k}.}}}\end{matrix} & (18)\end{matrix}$

The random variable Z_(k)|Y_(k−L) ^(k−1), X_(k−M) ^(k) may be approximated as a Gaussian distribution 𝒩(μ_(G), σ_(G) ²), where μ_(G) and σ_(G) ² are derived in (14). The conditional distribution is obtained by, for example, convolution between the Gaussian probability distribution of Z_(k)|Y_(k−L) ^(k−1), X_(k−M) ^(k) and the uniform distribution of U_(k).

$\begin{matrix}{\begin{aligned} f_{Y_{k}|Y_{k-L}^{k-1},X_{k-M}^{k}}\left( y_{k} \,\middle|\, y_{k-L}^{k-1}, x_{k-M}^{k} \right) &= f_{Z_{k}|Y_{k-L}^{k-1},X_{k-M}^{k}}\left( z_{k} \,\middle|\, y_{k-L}^{k-1}, x_{k-M}^{k} \right) * f_{U_{k}}(u) \\ &= \int_{y_{k}-\Delta_{k}}^{y_{k}} \frac{1}{\sqrt{2\pi}\,\sigma_{G}\Delta_{k}} e^{-\frac{(z_{k}-\mu_{G})^{2}}{2\sigma_{G}^{2}}}\, dz_{k} \\ &= \frac{1}{\Delta_{k}}\left( Q\left( \frac{y_{k}-\mu_{G}-\Delta_{k}}{\sigma_{G}} \right) - Q\left( \frac{y_{k}-\mu_{G}}{\sigma_{G}} \right) \right), \end{aligned}} & (19)\end{matrix}$

where the standard Q-function is defined as

${Q(\zeta)} = {\frac{1}{\sqrt{2\pi}}{\int_{\zeta}^{\infty}{{\exp\left( \frac{- \tau^{2}}{2} \right)}{d\tau}.}}}$

In two-dimensional (2D) page-oriented memories with cell-to-cell interference, a single cell is only affected by a finite anticausal neighborhood of nearby cells (which are programmed after the single cell). In the case of multilevel flash memories with the even/odd bit-line structure and using the full-sequence programming strategy, cells in even bit lines, referred to as even cells, are programmed first at one time instant, and then cells in odd bit lines, referred to as odd cells, are programmed at a later time instant. Hence, the neighborhoods are also dependent on whether the even cell or the odd cell is programmed in the programming cycle. Let (k, l) denote the location of a memory cell, which means that the cell is located at the k-th word line and the l-th bit line. The indices of the anticausal neighborhood for the odd cell may be indicated by σ_((k,l)) and the indices of the anticausal neighborhood for the even cell may be indicated by ε_((k,l)), as illustrated in FIG. 9. That is:

σ_((k,l)) ≜ {(k+1, l−1), (k+1, l), (k+1, l+1)}   (20)

and

ε_((k,l)) ≜ {(k, l−1), (k, l+1)} ∪ σ_((k,l)).   (21)
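As an illustrative sketch only, the odd-cell and even-cell neighborhoods of (20) and (21) may be written as index sets in Python. The function names are hypothetical, and the sketch does not handle array boundaries (indices falling outside the memory array would have to be discarded by the caller).

```python
def odd_neighborhood(k: int, l: int) -> set:
    """Anticausal neighborhood of an odd cell at word line k, bit line l, per (20)."""
    return {(k + 1, l - 1), (k + 1, l), (k + 1, l + 1)}


def even_neighborhood(k: int, l: int) -> set:
    """Anticausal neighborhood of an even cell, per (21).

    Even cells are programmed first, so they are also victimized by the two
    odd cells on the same word line.
    """
    return {(k, l - 1), (k, l + 1)} | odd_neighborhood(k, l)


def anticausal_neighborhood(k: int, l: int) -> set:
    """Pick the neighborhood from the parity of the bit-line index l."""
    return odd_neighborhood(k, l) if l % 2 == 1 else even_neighborhood(k, l)
```

Under these definitions an odd cell has three interfering neighbors and an even cell has five.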

The channel model for odd locations (the case when l is odd) is:

$\begin{matrix}{Y_{({k,})} = {X_{({k,})} + {\sum\limits_{{({m,n})} \in _{({k,})}}{\left( {Y_{({m,n})} + E_{({m,n})}} \right)B_{({m,n})}^{({k,})}}} + W_{({k,})}}} & (22)\end{matrix}$

and for even locations (the case when l is even):

$\begin{matrix}{Y_{({k,})} = {X_{({k,})} + {\sum\limits_{{({m,n})} \in ɛ_{({k,})}}{\left( {Y_{({m,n})} + E_{({m,n})}} \right)B_{({m,n})}^{({k,})}}} + {W_{({k,})}.}}} & (23)\end{matrix}$

If X_((k,l)) is a 2D i.i.d. process, in various embodiments the detector may be implemented as discussed hereinabove (i.e., a trellis is not needed). However, if X_((k,l)) is a process with 2D memory, an optimal detector is not known (since a 2D equivalent of a Viterbi detector is not available), and in various embodiments may be appropriately approximated using adequate (and, in various embodiments, interleaved) 1D Viterbi-like or symbol-by-symbol detectors.

FIG. 10 illustrates an embodiment in a 2-dimensional array. The neighborhood of signal samples represents the cells that victimize an even cell. It may be understood that similar representations may be made for even neighborhoods, or for an all-bit line write structure. The shown embodiment computes 3 statistics. Statistics 1 and 2 are linear statistics (and further statistic 1 is a trivial statistic). Statistic 3 is a nonlinear statistic. The arrows through some components indicate that the components may be signal-dependent. This means that the exact method of computing the statistics and/or branch metrics may depend on the postulated written signal (or a neighborhood of written signals). If the channel has signal-dependent or signal-independent ISI (in addition to ICI), then the decision metric may be a trellis branch metric. If the channel has no ISI (but only ICI), then the decision metric may be a symbol-by-symbol decision metric. This embodiment computes the two non-trivial statistics using 2 FIR filters. The first filter (coefficients γ) operates on the (possibly mean-adjusted) signal samples while the second FIR filter (coefficients g) operates on the (possibly mean-adjusted) squares of the signal samples. The pdf computation block may be any suitable pdf computation block. In one embodiment, the pdf block may correspond only to the Gaussian noise assumption. A second embodiment may use the quantization noise assumption (resulting in Q-function implementations). The value Δ denotes the quantization noise step, which itself may be signal-dependent. A third embodiment may consider, for example, a hybrid assumption. The logarithmic block may be useful in hard decision detection. In soft-decision detection, the logarithmic block may be omitted, or replaced by another block (such as, for example, a likelihood-ratio computing block or a probability computing block, depending on the type of soft information utilized). It may be understood that different types of soft or hard decisions will result in removal or replacement of the logarithmic block by another and possibly more suitable block.
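A minimal Python sketch of the three statistics of FIG. 10 follows, offered only as an illustration. The names `fir_statistics` and `gaussian_log_metric` are hypothetical. The way the statistics enter the Gaussian metric (the linear statistic shifting the mean, the quadratic statistic adding to the variance, and an optional sample-independent term `extra_var`) is an assumption made for this example and is not taken from the figure.

```python
import math


def fir_statistics(victim, neighbors, gamma, g, mean=0.0):
    """Return (statistic 1, statistic 2, statistic 3).

    statistic 1: the victim sample itself (trivial statistic)
    statistic 2: FIR filter with coefficients gamma on mean-adjusted samples
    statistic 3: FIR filter with coefficients g on squared mean-adjusted samples
    """
    centered = [y - mean for y in neighbors]
    linear = sum(c * y for c, y in zip(gamma, centered))
    quadratic = sum(c * y * y for c, y in zip(g, centered))
    return victim, linear, quadratic


def gaussian_log_metric(stats, x, sigma_w2, extra_var=0.0):
    """Log of a Gaussian pdf for postulated level x (assumed mapping, see lead-in)."""
    victim, linear, quadratic = stats
    mu = x + linear                           # assumed conditional mean
    var = sigma_w2 + quadratic + extra_var    # assumed conditional variance
    return -0.5 * math.log(2.0 * math.pi * var) - (victim - mu) ** 2 / (2.0 * var)
```

A hard decision would then select the postulated level with the largest metric, while a soft decision could keep the metrics for all levels.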

FIG. 11 illustrates an embodiment in a 2-dimensional array. The neighborhood of signal samples represents the cells that victimize an even cell. It may be understood that similar representations may be made for even neighborhoods, or for an all-bit line write structure. The shown embodiment computes N nonlinear statistics. Each statistic is a sample of the characteristic function (which itself is a non-real, complex quantity). Though no arrows through components are shown, it may be understood that the components may be signal-dependent. This means that the exact method of computing the statistics and/or branch metrics may depend on the postulated written signal (or a neighborhood of written signals). If the channel has signal-dependent or signal-independent ISI (in addition to ICI), then the decision metric may be a trellis branch metric. If the channel has no ISI (but only ICI), then the decision metric may be a symbol-by-symbol decision metric. The processing unit block may be fine-tuned to account for any conditional pdf (i.e., any noise pdf and/or ICI coefficient pdf). In one embodiment, the processing unit may correspond only to the Gaussian noise assumption. A second embodiment may use the quantization noise assumption (resulting in Q-function implementations). A third embodiment may consider a hybrid assumption. The processing unit may include a logarithmic sub-block depending on whether hard decisions or soft decisions are desired. A logarithmic sub-block may be useful in hard decision detection. In soft-decision detection, the logarithmic sub-block may be omitted, or replaced by another sub-block (such as, for example, a likelihood-ratio computing sub-block or a probability computing sub-block, depending on the type of soft information utilized). It may be understood that different types of soft or hard decisions will result in removal or inclusion of the logarithmic (or another) sub-block into the processing unit.

FIG. 12 illustrates an embodiment in a 2-dimensional array. The neighborhood of signal samples represents the cells that victimize an even cell, as well as the victim cell itself. It may be understood that similar representations may be made for even neighborhoods, or for an all-bit line write structure. The shown embodiment computes a set of nonlinear statistics by combining the characteristic function and the FFT into one computational unit. Though no arrows through components are shown, it may be understood that the components may be signal-dependent. This means that the exact method of computing the statistics and/or branch metrics may depend on the postulated written signal (or a neighborhood of written signals). If the channel has signal-dependent or signal-independent ISI (in addition to ICI), then the decision metric may be a trellis branch metric. If the channel has no ISI (but only ICI), then the decision metric may be a symbol-by-symbol decision metric. The processing unit block may be fine-tuned to account for any conditional pdf (i.e., any noise pdf and/or ICI coefficient pdf). In one embodiment, the processing unit may correspond only to the Gaussian noise assumption. A second embodiment may use the quantization noise assumption (resulting in Q-function implementations). A third embodiment may consider, for example, a hybrid assumption. The processing unit may include a logarithmic sub-block depending on whether hard decisions or soft decisions are desired. A logarithmic sub-block may be useful in hard decision detection. In soft-decision detection, the logarithmic sub-block may be omitted, or replaced by another sub-block (such as, for example, a likelihood-ratio computing sub-block or a probability computing sub-block, depending on the type of soft information utilized). It may be understood that different types of soft or hard decisions will result in removal or inclusion of the logarithmic (or another) sub-block into the processing unit. FIG. 12 shows one of the inputs into the processing unit as being P(x_(k,1)|x_(k,1), . . . ,X_(k+a,l+b)). In various embodiments, this may be the actual a-priori probability of input symbols, or it may be the a-posteriori probability provided by a soft decoder (in either an iterative or a non-iterative architecture).

Derivation and Description of Decision Statistics

Returning to the 1-dimensional signals, under the assumption that Y_(k−L) ^(k−1)=y_(k−L) ^(k−1) and X_(k−M) ^(k)=x_(k−M) ^(k) are given, Z_(l) is the product of two Gaussian random variables which may be rewritten as:

Z_(l)=B_(l)Γ_(l)   (24)

where Γ_(l) ~ 𝒩(y_(k−l), σ_(E) ²). It may be assumed B_(l)=B_(l) ^((k)) to simplify the notation. Then, the characteristic function for the product of two normal random variables B_(l)Γ_(l) may be computed as:

$\begin{matrix}\begin{matrix}{{G_{{Z_{}|Y_{k - }^{k}},X_{k - M}^{k}}(t)} = {\left\lbrack ^{\; B_{}\Gamma_{}t} \right\rbrack}} \\{= {\left\lbrack {\left\lbrack ^{\; B_{}\Gamma_{}t} \middle| \Gamma_{} \right\rbrack} \right\rbrack}} \\{= {\left\lbrack ^{{{\beta}_{}\Gamma_{}t} - {\frac{1}{2}g_{}\Gamma_{}^{2}t^{2}}} \right\rbrack}} \\{= {\frac{1}{\sqrt{2{\pi\sigma}_{E}^{2}}}{\int_{–\infty}^{\infty}{^{({{{\beta}_{}\gamma \; t} - {\frac{1}{2}g_{}\gamma^{2}t^{2}}})}^{- \frac{{({\gamma - y_{k - }})}^{2}}{2\sigma_{E}^{2}}}{\gamma}}}}} \\{= {\frac{1}{\sqrt{2{\pi\sigma}_{E}^{2}}}{\int_{- \infty}^{\infty}{\exp\left( {{- \frac{\left\lbrack {\gamma - \frac{y_{k - } + {\; t\; \beta_{}\sigma_{E}^{2}}}{1 + {t^{2}g_{}\sigma_{E}^{2}}}} \right\rbrack^{2}}{\frac{2\sigma_{E}^{2}}{1 + {t^{2}g_{}\sigma_{E}^{2}}}}} -} \right.}}}} \\{\left. \frac{{t^{2}\left( {{y_{k - }^{2}g_{}} + {\beta_{}^{2}\sigma_{E}^{2}}} \right)} - {2\; {ty}_{k - }\beta_{}}}{2\left( {1 + {g_{}\sigma_{E}^{2}t^{2}}} \right)} \right){\gamma}} \\{= \frac{1}{\sqrt{1 + {g_{}\sigma_{E}^{2}t^{2}}}}} \\{{\exp \left( \frac{{- {t^{2}\left( {{y_{k - }^{2}g_{}} + {\beta_{}^{2}\sigma_{E}^{2}}} \right)}} + {2\; {ty}_{k - }\beta_{}}}{2\left( {1 + {g_{}\sigma_{E}^{2}t^{2}}} \right)} \right)}}\end{matrix} & (25)\end{matrix}$

FIG. 13 illustrates an embodiment of a process for determining decision metric values. At step 1010, signal samples are received. In various embodiments, the signal samples may be, for example, unconditioned signal samples, mean-adjusted signal samples, pre-equalized signal samples, digitized signal samples and/or quantized signal samples. At step 1012, at least one set of statistics is computed using the signal samples. In various embodiments, at least one of the statistics in the set is not a real linear statistic. At step 1014, the statistics are used as arguments of decision metric functions and at step 1016 the decision metrics are evaluated. At step 1018 the decision metrics are used (e.g., by a symbol-by-symbol based detector, a trellis-based detector, etc.) to compute hard and/or soft decisions. At step 1020, the decisions are passed to, for example, a decoder. At step 1022, the process continues for a different location of memory or different signal samples that are loaded by, for example, updating the content of a shift register to load the different signal samples.
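The steps of FIG. 13 may be sketched, for a symbol-by-symbol hard-decision detector, as the short Python loop below. This is an illustration under stated assumptions rather than the described implementation: `detect`, `toy_statistics` and `toy_metric` are hypothetical names, and the toy statistics and the constants 0.1, 0.01 and 0.001 in the toy metric are invented solely to make the example runnable.

```python
import math


def detect(samples, levels, window, extract_statistics, decision_metric):
    """Slide a window over the samples and return one hard decision per position."""
    decisions = []
    for k in range(window - 1, len(samples)):      # step 1022: shift to the next samples
        block = samples[k - window + 1 : k + 1]    # step 1010: receive signal samples
        stats = extract_statistics(block)          # step 1012: compute statistics
        metrics = [decision_metric(stats, x) for x in levels]  # steps 1014 and 1016
        best = max(range(len(levels)), key=metrics.__getitem__)
        decisions.append(levels[best])             # step 1018: hard decision
    return decisions                               # step 1020: pass on to a decoder


def toy_statistics(block):
    """Current sample, previous sample, and the square of the previous sample."""
    prev, cur = block[0], block[-1]
    return cur, prev, prev * prev


def toy_metric(stats, x):
    """Gaussian log-metric with invented coupling constants (illustration only)."""
    cur, prev, prev_sq = stats
    mean = x + 0.1 * prev
    var = 0.01 + 0.001 * prev_sq
    return -0.5 * math.log(var) - (cur - mean) ** 2 / (2.0 * var)
```

A trellis-based detector would use the same statistics and metrics as branch metrics instead of taking the maximum independently at each position.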

The following illustrates examples of statistics derived from a plurality of signal samples in order to delineate those statistics that are applicable and those that are not applicable to various embodiments. It may be understood that these are only examples and by no means represent an exhaustive list.

I) Examples of real linear statistics extracted from a plurality of signal samples:

a) y_(k)−a(y_(k−1)−μ_(e))−b(y_(k−2)−μ_(e))

where a and b are real coefficients; and y_(k), (y_(k−1)−μ_(e)) and (y_(k−2)−μ_(e)) are either signal samples or mean-adjusted signal samples (the mean being μ_(e)). This is an example of a statistic derived from signals in 1 dimension.

b) y_(k,l)−a(y_(k−1,l−1)−μ_(e))−b(y_(k−2,l+1))

where a and b are real coefficients. This is an example of a statistic derived from signals in 2 dimensions.

II) Examples of nonlinear statistics that are actually functions of linear statistics extracted from a plurality of signal samples:

a) [y_(k)−a(y_(k−1)−μ_(e))−b(y_(k−2)−μ_(e))]²

where a and b are real coefficients; and y_(k), (y_(k−1)−μ_(e)) and (y_(k−2)−μ_(e)) are either signal samples or mean-adjusted signal samples (the mean being μ_(e)). This is an example of a statistic derived from signals in 1 dimension. This is a simple square of the linear statistic given in I)-a), so it is a function of a single linear statistic derived from a plurality of signal samples.

b) log [y_(k,l)−a(y_(k−1,l−1)−μ_(e))−b(y_(k−2,l+1))]

where a and b are real coefficients. This is an example of a statistic derived from signals in 2 dimensions. It is a simple logarithm of the linear statistic given in I)-b), so it is a function of a single linear statistic.

c) [y_(k)−a(y_(k−1)−μ_(e))−b(y_(k−2)−μ_(e))]²+[y_(k)−A(y_(k−1)−μ_(e))−B(y_(k−2)−μ_(e))]²

where a, b, A and B are real coefficients; and y_(k), (y_(k−1)−μ_(e)) and (y_(k−2)−μ_(e)) are either signal samples or mean-adjusted signal samples (the mean being μ_(e)). This is an example of a statistic obtained as a function of 2 linear statistics, each of which is derived from a plurality of signal samples. So, it is a function of previously derived linear statistics involving a plurality of signal samples.

III) Examples of complex linear statistics derived from a plurality of signal samples:

a) y_(k)−a(y_(k−1)−μ_(e))−b(y_(k−2)−μ_(e))

where a and b are complex (non-real) coefficients; and y_(k), (y_(k−1)−μ_(e)) and (y_(k−2)−μ_(e)) are either signal samples or mean-adjusted signal samples (the mean being μ_(e)). A discrete Fourier transform is an example of a complex linear sufficient statistic.

b) y_(k,l)−a(y_(k−1,l−1)−μ_(e))−b(y_(k−2,l+1))

where a and b are complex (non-real) coefficients. This is an example of a statistic derived from a signal in 2 dimensions.

IV) Examples of genuine nonlinear statistics derived from a plurality of signal samples that are not simple functions of a linear statistic (derived from a plurality of signal samples):

a) y_(k)−a(y_(k−1)−μ_(e))²−b(y_(k−2)−μ_(e))²

where a and b are complex or real coefficients; and y_(k), (y_(k−1)−μ_(e)) and (y_(k−2)−μ_(e)) are either signal samples or mean-adjusted signal samples (the mean being μ_(e)). Because of the “squared signal” terms, the statistic is nonlinear. Note that this statistic cannot be written as a function of a linear statistic (derived from a plurality of signal samples), so it is genuinely nonlinear.

${\left. b \right)\mspace{14mu} \frac{y_{k,\ell}}{b\left( y_{{k - 2},{\ell + 1}} \right)}} + \frac{y_{k,\ell}}{a\left( {y_{{k - 1},{\ell - 1}} - \mu_{e}} \right)}$

where a and b are complex or real coefficients. This is a genuine non-linear statistic (derived from a plurality of signal samples) because it cannot be obtained as a function of a single linear statistic (obtained from a plurality of signal samples) or multiple linear statistics (obtained from a plurality of signal samples).

c) The characteristic function G_(Y_(k)|Y_(k−1),Y_(k−2),X_(k))(t|y_(k−1), y_(k−2), x_(k))

sampled at an arbitrary value t, because the general form cannot be written as a function of a single linear statistic (obtained from a plurality of signal samples) or multiple linear statistics (obtained from a plurality of signal samples).

A detector constructed according to various embodiments operates on complex linear statistics derived from signal samples and genuine nonlinear statistics that are not simple functions of linear statistics, examples of which are illustrated hereinabove at III and IV.
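The distinction between categories II and IV may be illustrated with a small numerical example in Python. The function names and the sample values are hypothetical, and the example only shows the behavior for one particular choice of coefficients (a=b=1, μ_(e)=0); it is not a general proof. Two sample vectors that share the same value of the linear statistic I)-a) necessarily share the value of any function of that statistic, such as II)-a), whereas the statistic IV)-a) can still tell them apart.

```python
def linear_statistic(y, a, b, mu_e=0.0):
    """Statistic I)-a): y_k - a (y_{k-1} - mu_e) - b (y_{k-2} - mu_e)."""
    y_k, y_k1, y_k2 = y
    return y_k - a * (y_k1 - mu_e) - b * (y_k2 - mu_e)


def squared_linear_statistic(y, a, b, mu_e=0.0):
    """Statistic II)-a): a function (the square) of the linear statistic."""
    return linear_statistic(y, a, b, mu_e) ** 2


def genuine_nonlinear_statistic(y, a, b, mu_e=0.0):
    """Statistic IV)-a): the adjacent samples are squared before combining."""
    y_k, y_k1, y_k2 = y
    return y_k - a * (y_k1 - mu_e) ** 2 - b * (y_k2 - mu_e) ** 2


# Two hypothetical sample vectors (y_k, y_{k-1}, y_{k-2}).
first = (3.0, 1.0, 2.0)
second = (3.0, 3.0, 0.0)
```

For these two vectors the linear statistic and its square coincide, while the statistic of IV)-a) takes different values, so it carries information that the chosen linear statistic does not.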

Performance Curves

Simulations were performed of various embodiments of the methods and systems herein using an even/odd bit-line structure. A 4-level flash memory channel was selected, where the channel input X_(k) is an i.i.d. process with parameters Pr(X_(k)=v_(j))=0.25 for any of the 4 levels v₀, v₁, v₂, or v₃. The parameters of the 4-level flash memory (2D channel) with signal-dependent noise are given in Table 1. With the parameters as in Table 1, and using σ=1, FIG. 14 illustrates the pdf of each level's voltage when no ICI occurs.

TABLE 1
PARAMETERS OF THE 4-LEVEL FLASH MEMORY

i                  0        1        2        3
ith level v_(i)    1.1      2.7      3.3      3.9
Δ(i)               0        0.3      0.3      0.3
σ_(w)(i)           0.35σ    0.03σ    0.03σ    0.03σ

It was assumed that the random coupling ratios Γ_((a,b)) ^((k,l)) have the following Gaussian distributions:

Γ_((k,l−1)) ^((k,l)) ~ 𝒩(γ_(h), g_(h)), Γ_((k,l+1)) ^((k,l)) ~ 𝒩(γ_(h), g_(h)), Γ_((k+1,l−1)) ^((k,l)) ~ 𝒩(γ_(d), g_(d)), Γ_((k+1,l+1)) ^((k,l)) ~ 𝒩(γ_(d), g_(d)), Γ_((k+1,l)) ^((k,l)) ~ 𝒩(γ_(v), g_(v)),   (26)

where the subscripts h, v and d mean horizontal, vertical and diagonal interference, respectively. It was also assumed that γ_(h):γ_(v):γ_(d)=0.1:0.08:0.006 and g_(i)=0.09γ_(i) ² for i ∈ {h, v, d}. Let s be the intercell coupling strength factor. Then γ_(h)=0.1 s, γ_(v)=0.08 s and γ_(d)=0.006 s.
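The coupling parameters stated above follow from s by simple arithmetic, as the Python sketch below illustrates. The function name `coupling_parameters` and the dictionary layout are hypothetical.

```python
def coupling_parameters(s: float):
    """Return (means, variances) of the coupling ratios for strength factor s.

    Means follow gamma_h : gamma_v : gamma_d = 0.1 : 0.08 : 0.006 scaled by s,
    and variances follow g_i = 0.09 * gamma_i ** 2.
    """
    ratios = {"h": 0.1, "v": 0.08, "d": 0.006}
    means = {name: ratio * s for name, ratio in ratios.items()}
    variances = {name: 0.09 * mean * mean for name, mean in means.items()}
    return means, variances
```

Because g_(i) is proportional to γ_(i) ², the standard deviation of each coupling ratio is 30% of its mean for every value of s.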

In a first simulation scenario, σ=1 and the coupling strength factor s was varied from 0 to 2.

In a second simulation scenario, s=0.75, and the parameter σ was varied (see Table 1). By varying σ, the signal-to-noise ratio (SNR) was varied, defined as:

$\begin{matrix}{{SNR}\overset{\Delta}{=}{\frac{1}{\sum\limits_{i}{{\Pr\left( {X_{k} = v_{i}} \right)}{\sigma_{w}^{2}(i)}}}.}} & (27)\end{matrix}$
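As a worked illustration of (27), the following Python sketch evaluates the SNR for the equiprobable levels and the per-level noise factors listed in Table 1. The names are hypothetical, and the conversion to decibels as 10·log₁₀ is the usual convention, assumed here rather than stated in the text.

```python
import math

LEVEL_PROBABILITIES = [0.25, 0.25, 0.25, 0.25]  # Pr(X_k = v_i)
SIGMA_W_FACTORS = [0.35, 0.03, 0.03, 0.03]      # sigma_w(i) / sigma, from Table 1


def snr(sigma: float) -> float:
    """SNR per (27): reciprocal of the probability-weighted noise variance."""
    average_noise_power = sum(
        p * (factor * sigma) ** 2
        for p, factor in zip(LEVEL_PROBABILITIES, SIGMA_W_FACTORS)
    )
    return 1.0 / average_noise_power


def snr_db(sigma: float) -> float:
    """SNR in decibels (assumed convention: 10 * log10 of the ratio)."""
    return 10.0 * math.log10(snr(sigma))
```

Since every σ_(w)(i) scales linearly with σ, doubling σ lowers the SNR by a factor of four, or about 6 dB.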

SIQ is the capacity of random linear block codes, which is proven to be the highest information rate achievable by a random low-density parity-check (LDPC) error correction code. Furthermore, the SIQ allows a comparison of performances of codes without going through the complicated task of simulating the actual codes. For example, if the SIQ of detector A is 0.5 dB better than the SIQ of detector B, then a random LDPC code using outputs from detector A will outperform the same random LDPC code using outputs from detector B by 0.5 dB. In other words, if detector A is used, a 0.5 dB weaker code may be used while achieving the same overall system performance.

The mutual information terms may be computed numerically using Monte-Carlo simulations for any detector (also for a hard-decision detector). For the special case of a MAP detector, the soft-information quality q_(MAP) has an alternative interpretation, i.e., q_(MAP) is equal to the BCJR-once bound. FIG. 15 shows the soft information qualities of the MAP detector and the GA detector when the coupling strength factor s varies for fixed SNR. FIG. 16 shows the soft information quality curves when the SNR varies for fixed s=0.75. Also shown in FIGS. 15 and 16 are soft information qualities of the post-compensation detector and the raw detector. FIGS. 15 and 16 also show an upper bound on the soft information quality of a soft-output detector, denoted by q*_(Dong). At SIQ=1.8 bits per cell (which corresponds to a code rate of 0.9 user bits per channel bit), the MAP detector outperforms known detectors by 0.35 dB, as shown in FIG. 16.

As observable in FIG. 15, the Gaussian approximation (GA-MAP) detector (the FIR filter embodiment that utilizes statistics derived by squaring the signal samples) betters the purely linear statistics embodiment (denoted by “postcomp” in FIG. 15) by roughly 20% in terms of ICI tolerance capability. Further, the figure reveals that the GA-MAP detector has roughly a 100% larger ICI tolerance when compared to the “raw” detector (i.e., a simple slicer that does not consider a plurality of signal samples in the decision metric). FIG. 15 also shows that the characteristic function statistics (CF) embodiment (MAP) has a roughly 60% better ICI tolerance than the “postcomp” (linear) detector, and a roughly 170% better ICI tolerance than the “raw” (slicer) detector.

As shown in FIG. 16, the MAP detectors perform better than the other detectors. The variable N in the MAP detector is the number of quantization points in computing the FFT (i.e., N is the support length of the FFT). In the simulations, N=512 was used.
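To illustrate how N samples of a characteristic function can be turned back into a pdf value, the Python sketch below evaluates the Fourier inversion integral with a midpoint rule on N=512 points. It is an assumption-laden illustration and not the simulated detector: it uses a direct discrete Fourier sum instead of an FFT (an FFT would evaluate the same sum on a uniform grid of y values more efficiently), and the function name `pdf_from_cf` and the truncation limit `t_max` are hypothetical.

```python
import cmath
import math


def pdf_from_cf(cf, y_values, n=512, t_max=40.0):
    """Approximate pdf(y) = (1 / 2 pi) * integral of cf(t) * exp(-i t y) dt.

    The integral is truncated to [-t_max, t_max] and evaluated with the
    midpoint rule on n points, i.e. a direct (non-fast) Fourier sum.
    """
    dt = 2.0 * t_max / n
    ts = [-t_max + (i + 0.5) * dt for i in range(n)]
    samples = [cf(t) for t in ts]  # the n characteristic-function statistics
    pdf = []
    for y in y_values:
        acc = sum(g * cmath.exp(-1j * t * y) for g, t in zip(samples, ts))
        pdf.append((acc * dt / (2.0 * math.pi)).real)
    return pdf


def gaussian_cf(t, mu=1.0, sigma=0.5):
    """Characteristic function of N(mu, sigma^2), used here as a sanity check."""
    return cmath.exp(1j * mu * t - 0.5 * sigma * sigma * t * t)
```

The choice of `t_max` and N trades off truncation error against the spacing of the samples; both would need to be matched to the spread of the actual conditional pdf.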

In another aspect, the invention may be implemented as a non-transitory computer readable medium containing software for causing a computer or computer system to perform the method described above. The software may include various modules that are used to enable a processor and a user interface to perform the methods described herein.

It will be readily appreciated by those skilled in the art that modifications may be made to the invention without departing from the concepts disclosed in the foregoing description. Accordingly, the particular embodiments described in detail herein are illustrative only and are not limiting to the scope of the invention.

What is claimed is:
 1. A method for determining decision metrics in a detector for a memory device, the method comprising: receiving a plurality of signal samples; extracting a set of statistics from the signal samples, wherein at least one of the statistics is non-linear or complex, is derived from a plurality of the signal samples, and is not a function of at least one real linear statistic that is derived from a plurality of the signal samples; and applying at least one decision metric function to the set of statistics to determine at least one decision metric value corresponding to at least one postulated symbol.
 2. The method of claim 1, wherein extracting a set of statistics includes at least one of filtering quadratic signals, filtering cubed signals, computing a characteristic function, and computing a fast Fourier transform (FFT).
 3. The method of claim 1, wherein applying at least one decision metric function includes applying at least one of a symbol by symbol metric function and a branch metric function.
 4. The method of claim 2, wherein applying at least one decision metric function includes applying at least one of a symbol by symbol metric function and a branch metric function.
 5. A system, comprising: a memory channel; and a detector in communication with the memory channel, the detector configured to: receive a plurality of signal samples; extract a set of statistics from the signal samples, wherein at least one of the statistics is non-linear or complex, is derived from a plurality of the signal samples, and is not a function of at least one real linear statistic that is derived from a plurality of the signal samples; and apply at least one decision metric function to the set of statistics to determine at least one decision metric value corresponding to at least one postulated symbol.
 6. The system of claim 5, further comprising: an encoder in communication with the memory channel; and a decoder in communication with the detector.
 7. The system of claim 5, wherein the detector is one of a hard decision detector and a soft decision detector.
 8. The system of claim 5, wherein the detector is one of a symbol-by-symbol detector and a Viterbi-like detector.
 9. The system of claim 5, wherein the detector is configured to extract a set of statistics from the signal samples by at least one of filtering quadratic signals, filtering cubed signals, computing a characteristic function, and computing a fast Fourier transform (FFT).
 10. A detector for a memory device, the detector comprising: a first circuit configured to receive a plurality of signal samples; a second circuit in communication with the first circuit, the second circuit configured to extract a set of statistics from the signal samples, wherein at least one of the statistics is non-linear or complex, is derived from a plurality of the signal samples, and is not a function of at least one real linear statistic that is derived from a plurality of the signal samples; and a third circuit in communication with the second circuit, the third circuit configured to apply at least one decision metric function to the set of statistics to determine at least one decision metric value corresponding to at least one postulated symbol.
 11. The detector of claim 10, wherein the second circuit is further configured to at least filter quadratic signals, filter cubed signals, compute a characteristic function, and compute a fast Fourier transform (FFT).
 12. An apparatus, comprising: means for receiving a plurality of signal samples; means for extracting a set of statistics from the signal samples, wherein at least one of the statistics is non-linear or complex, is derived from a plurality of the signal samples, and is not a function of at least one real linear statistic that is derived from a plurality of the signal samples; and means for applying at least one decision metric function to the set of statistics to determine at least one decision metric value corresponding to at least one postulated symbol.
 13. A non-transitory computer readable medium including software for: receiving a plurality of signal samples; extracting a set of statistics from the signal samples, wherein at least one of the statistics is non-linear or complex, is derived from a plurality of the signal samples, and is not a function of at least one real linear statistic that is derived from a plurality of the signal samples; and applying at least one decision metric function to the set of statistics to determine at least one decision metric value corresponding to at least one postulated symbol.
 14. A method for determining decision metrics in a detector for a memory device, the method comprising: receiving a plurality of signal samples; computing a set of statistics, wherein at least one of the statistics is obtained by FIR filtering or IIR filtering of at least one squared signal sample; and applying at least one decision metric function to the set of statistics to determine at least one decision metric value corresponding to at least one postulated symbol.
 15. The method of claim 14, wherein applying at least one decision metric function includes applying at least one of a symbol by symbol metric function and a branch metric function.
 16. A method for determining decision metrics in a detector for a memory device, the method comprising: receiving a plurality of signal samples; computing a set of statistics using a transformation of signal samples to obtain a characteristic-function-like set of statistics; and applying at least one decision metric function to the set of statistics to determine at least one decision metric value corresponding to at least one postulated symbol.
 17. The method of claim 16, wherein the decision metric function includes a fast Fourier transform (FFT) calculation.
 18. The method of claim 16, wherein applying at least one decision metric function includes applying at least one of a symbol by symbol metric function and a branch metric function.
 19. A method for determining decision metrics in a detector for a memory device, the method comprising: receiving a signal sample and at least one adjacent signal sample; computing a set of at least one statistic, wherein the at least one statistic is obtained by nonlinearly processing the at least one adjacent signal sample; and applying at least one decision metric function to the at least one statistic to determine at least one decision metric value corresponding to at least one postulated symbol.
 20. The method of claim 19, wherein the at least one statistic is obtained by squaring the at least one adjacent signal sample.
 21. The method of claim 19, wherein the at least one statistic is obtained by computing the characteristic function of the at least one adjacent signal sample.
 22. The method of claim 19, wherein applying at least one decision metric function comprises applying at least one of a symbol-by-symbol function and a branch metric function.
 23. The method of claim 20, wherein applying at least one decision metric function comprises applying at least one of a symbol-by-symbol function and a branch metric function.
 24. The method of claim 21, wherein applying at least one decision metric function comprises applying at least one of a symbol-by-symbol function and a branch metric function.